MISS: a non-linear methodology based on mutual information for genetic association studies in both population and sib-pairs analysis
Abstract
Motivation: Finding associations between genetic variants and disease-related phenotypes has become an important vehicle for the study of complex disorders. In this context, multi-locus genetic association might unravel additional information compared with single-locus search. The main goal of this work is to propose a non-linear methodology based on information theory for finding combinatorial associations between multiple SNPs and a given phenotype.

Results: The proposed methodology, called MISS (mutual information statistical significance), has been integrated with a feature selection algorithm and tested on a synthetic dataset with a controlled phenotype and on the particular case of the F7 gene. MISS has been compared with a multiple linear regression (MLR) method used for genetic association, in both a population-based study and a sib-pairs analysis, and with the maximum entropy conditional probability modelling (MECPM) method, which searches for predictive multi-locus interactions. Several sets of SNPs within the F7 gene region show a significant correlation with FVII levels in blood. The proposed multi-site approach unveils combinations of SNPs that carry more significant information about the phenotype than their individual polymorphisms. MISS finds more correlations between SNPs and the phenotype than MLR and MECPM. Most of the SNPs it marks appear in the literature as functional variants with a real effect on FVII protein levels in blood.

Availability: The code is available at http://sisbio.recerca.upc.edu/R/MISS_0.2.tar.gz
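To make the core idea concrete, the following is a minimal sketch of mutual information between a SNP combination and a phenotype, with a permutation test for significance. It is not the MISS implementation from the package linked above. The function names (`mutual_information`, `permutation_p_value`), the binary toy phenotype, the plug-in estimator, and the permutation scheme are all assumptions made for illustration. The abstract does not state how MISS discretizes FVII levels or computes significance.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def permutation_p_value(xs, ys, n_perm=1000, seed=0):
    """Observed MI and the share of phenotype shuffles that reach it."""
    rng = random.Random(seed)
    observed = mutual_information(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype link
        if mutual_information(xs, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: two biallelic markers coded 0/1 and a phenotype
# that depends only on their combination (XOR), not on either one alone.
snp_a = [0, 0, 1, 1] * 10
snp_b = [0, 1, 0, 1] * 10
pheno = [a ^ b for a, b in zip(snp_a, snp_b)]
pair = list(zip(snp_a, snp_b))

mi_a = mutual_information(snp_a, pheno)   # single marker
mi_pair, p_pair = permutation_p_value(pair, pheno)  # marker combination
```

In this toy case each marker alone carries no information about the phenotype, while the pair determines it completely. This is the kind of combinatorial effect that a single-locus or purely linear search can miss.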
Journal: Bioinformatics
Volume 26, Issue 15
Publication date: 2010